Multilingual Ontologies for Cross-Language Information Extraction and Semantic Search

نویسندگان

David W. Embley

Stephen W. Liddle

Deryle W. Lonsdale

Yuri A. Tijerino

چکیده

Valuable local information is often available on the web, but encoded in a foreign language that non-local users do not understand. Can we create a system to allow a user to query in language L1 for facts in a web page written in language L2? We propose a suite of multilingual extraction ontologies as a solution to this problem. We ground extraction ontologies in each language of interest, and we map both the data and the metadata among the language-specific extraction ontologies. The mappings are through a central, language-agnostic ontology that allows new languages to be added by only having to provide one mapping rather than one for each language pair. Results from an implemented early prototype demonstrate the feasibility of cross-language information extraction and semantic search. Further, results from an experimental evaluation of ontology-based query translation and extraction accuracy are remarkably good given the complexity of the problem and the complications of its implementation.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multilingual Extraction Ontologies

The growth of multilingual web content and increasing internationalization portends the need for cross-language query processing. We offer ML-OntoES (a MultiLingual Ontology-based Extraction System) as a solution for narrowdomain/data-rich applications. Based on language-independent extraction ontologies (Embley, Liddle, & Lonsdale, 2011), ML-OntoES enables semantic search over domain-specific,...

متن کامل

Cross-Language Hybrid Keyword and Semantic Search

The growth of multilingual web content and increasing internationalization portends the need for cross-language information retrieval. As a solution to this problem for narrow-domain, data-rich web content, we offer ML-HyKSS: MultiLingual Hybrid Keyword and Semantic Search. The key component of ML-HyKSS is a collection of linguistically grounded conceptual-model instances called extraction onto...

متن کامل

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

متن کامل

Cross-Lingual Information Retrieval and Semantic Interoperability for Cultural Heritage Repositories

This paper describes a computational linguistics-based approach for providing interoperability between multi-lingual systems in order to overcome crucial issues like cross-language and cross-collection retrieval. Our proposal is a system which improves capabilities of language-technology-based information extraction. In the last few years various theories have been developed and applied for mak...

متن کامل

Multi-word processing in an ontology-based Cross-Language Information Retrieval model for specific domain collections

This paper proposes a methodological approach to CLIR applications for the development of a system which improves multi-word processing when specific domain translation is required. The system is based on a multilingual ontology, which can improve both translation and retrieval accuracy and effectiveness. The proposed framework allows mapping data and metadata among language-specific ontologies...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2011

Multilingual Ontologies for Cross-Language Information Extraction and Semantic Search

نویسندگان

چکیده

منابع مشابه

Multilingual Extraction Ontologies

Cross-Language Hybrid Keyword and Semantic Search

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Cross-Lingual Information Retrieval and Semantic Interoperability for Cultural Heritage Repositories

Multi-word processing in an ontology-based Cross-Language Information Retrieval model for specific domain collections

عنوان ژورنال:

اشتراک گذاری